Patent abstract:
The signal processing module comprises at least one operational unit (34) incorporating calculation units (31), input and output interfaces (52, 53) adapted to be connected to a bus, and a memory (30) storing data for said calculation units, the memory (30) being organized such that each data word is stored by column on several addresses in an application-dependent order, a column having a width of one bit, the words being transferred in series to said calculation units (31).
Publication number: FR3015068A1
Application number: FR1362859
Filing date: 2013-12-18
Publication date: 2015-06-19
Inventors: Marc Duranton; Jean-Marc Philippe
Applicants: Commissariat a lEnergie Atomique CEA; Commissariat a lEnergie Atomique et aux Energies Alternatives CEA;
IPC main classification:
Patent description:

[0001] The present invention relates to a signal processing module, in particular one able to implement algorithms of the neural network type. The invention also relates to a neural circuit. BACKGROUND OF THE INVENTION It applies in particular to the implementation of neural networks on silicon for the processing of various signals, including multidimensional signals such as images. It also allows the efficient realization of conventional signal processing methods. Neural networks are already widely used and can potentially be used in a wide variety of applications, including all devices, systems or processes that use learning approaches or mechanisms to define the function to be performed, as opposed to more traditional approaches in which the actions to be performed are explicitly defined by a "program". A multitude of systems, ranging from the most sophisticated technical or scientific fields to the realms of everyday life, are thus concerned. All these applications require ever-increasing performance, particularly in terms of computing power, to achieve increasingly complex functions and adaptability, while controlling size and energy consumption. The algorithms used are essential to achieving these performances. The hardware architecture implementing these algorithms must also be taken into account, especially at a time when the growth in processor frequency is stagnating or at least seems to have reached its limits.
[0002] As a first approximation, neural hardware architectures can be classified along two axes: - a first axis concerns their structure, which can be digital, analog, or even hybrid; - a second axis concerns their degree of specialization with respect to the neural networks that can be implemented: architectures can be specialized for certain well-defined neural networks, such as RBF (Radial-Basis Function) networks or Kohonen maps, or can be generic, in particular programmable, so as to allow a wider variety of networks to be implemented. The types of system addressed by the present patent application relate to generic circuits with a digital implementation.
[0003] The hardware architectures of neural systems generally comprise basic elementary modules able to implement a set of neurons. In known manner, a neuron of order i in a neural system performs a function of the type f(Σ_j w_ij·E_j), w_ij and E_j being respectively the synaptic weights associated with the neuron and its inputs. The elementary module includes the arithmetic and logic units (ALU) needed to perform all these neuronal functions. f is usually a nonlinear function. A technical problem to be solved is in particular to make efficient use of the silicon on which the neural networks are implanted, in particular to allow optimal use of the storage of weights and other data in the internal memory of the hardware architecture. Another problem is to allow a hardware realization that is scalable in the number of neurons/synapses (and thus of inputs).
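As an illustration of this neuron function, the following is a minimal Python sketch; the text leaves f generic, so the saturating nonlinearity used here is a hypothetical example:

```python
def neuron_output(weights, inputs, f):
    """Order-i neuron: f(sum_j w_ij * E_j), with w_ij the synaptic
    weights of neuron i, E_j its inputs and f a nonlinear function."""
    s = sum(w * e for w, e in zip(weights, inputs))
    return f(s)

# Hypothetical hard-saturation nonlinearity, for illustration only.
def saturate(x):
    return max(-1.0, min(1.0, x))

print(neuron_output([0.5, -0.25, 1.0], [1.0, 0.5, -0.5], saturate))  # -0.125
```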
[0004] An object of the invention is in particular to overcome the aforementioned drawbacks. To this end, the invention relates to a signal processing module comprising at least one operational unit incorporating calculation units, input and output interfaces capable of being connected to a bus, and a memory storing data intended for said calculation units, said memory being organized in such a way that each data word is stored per column on several addresses, a column having a width of one bit, the words being transferred in series to said calculation units. In one possible embodiment, each data word is stored per column on several addresses in an order that depends on the application using said data; this set of several addresses includes for example address jumps. The data transfers are carried out for example with one column per calculation unit.
[0005] In one possible embodiment, the module comprises a routing unit connected between said memory and the operational unit, said routing unit having a number of inputs at least equal to the number of bits of width of said memory, each input being connected to one and only one column, said routing unit routing the data words of said memory to the calculation units, the same word being able to be routed to several calculation units. The routing unit comprises for example at least two other series of inputs/outputs able to be connected to circuits external to said module.
[0006] These inputs/outputs are for example able to be connected to the inputs and outputs of the routing unit of another module identical to said module. For example, the routing unit performs all or some of the following operations: bit shifting of the data words; logical operations; extension of words. The module comprises for example a memory virtualization unit connected on the one hand in write and read to said memory and on the other hand to an external memory via a DMA-type circuit. The memory virtualization unit performs, for example, reorganization operations on said memory. These reorganization operations are done for example by duplication or reordering of the data between the columns of said memory. The input and output interfaces communicate for example with said bus by a TDMA protocol. Said memory allows for example independent read and write access for the input interfaces, the virtualization unit and the routing unit. Advantageously, the module operates for example according to several independent synchronous zones, the operational unit operating in a first clock domain, the input and output interfaces operating in a second clock domain. The memory virtualization unit operates for example in a third independent clock domain.
[0007] For example, the calculation units execute the operations according to the value of a guard bit assigned to each of said units. The operational unit comprises for example at least 32 calculation units, said memory having a width of 32 bits.
[0008] The module being able to implement a set of neurons, it performs for example neural calculations or digital signal processing, said memory storing at least calculation results, filter coefficients or convolution products, and synaptic coefficients.
[0009] The invention also relates to a circuit adapted to implement a neural network, characterized in that it comprises at least one series of signal processing modules, as described above, adapted to implement a set of neurons. The signal processing modules are for example grouped into branches, a branch being formed of a group of modules and a broadcast bus, said modules being connected to said bus, a routing block connected to the broadcast buses of said branches performing at least the routing and broadcasting of the input and output data of said circuit to and from said branches.
[0010] Other characteristics and advantages of the invention will become apparent with the aid of the description which follows, given with regard to the appended drawings which represent: FIG. 1, an example of a neural system comprising a series of elementary processing modules, called neuro-blocks in the following; FIG. 2, the storage of data (synaptic weights, inputs, etc.) in a memory; FIG. 3, a possible mode of operation of a signal processing module according to the invention; FIG. 4, another possible mode of operation of a module according to the invention; FIG. 5, an example of a possible embodiment of a module according to the invention; FIG. 6, an example of operation of a module according to the invention with several independent synchronizations.
[0011] Figure 1 illustrates by way of example a neural system comprising a series of neuro-blocks. The invention is described by way of example for a signal processing module applied to neural networks, but may be applicable to other types of processing. In the example of FIG. 1, the system 10 comprises 32 neuro-blocks 1. A neuro-block can be considered as the basic element since it is able to implement a set of neurons. As indicated above, a neuron of order i performs a function of the following type: f(Σ_j w_ij·E_j), w_ij and E_j being respectively the synaptic weights associated with the neuron and its inputs, and f being generally a nonlinear function. In the implementation example of Figure 1, the neuro-blocks 1 are distributed by branches. A branch is composed of several neuro-blocks 1 and a broadcast bus 2 shared by the neuro-blocks connected to this bus. In a configuration with 32 neuro-blocks for example, the neuro-blocks can be divided into 4 branches of 8 neuro-blocks or 8 branches of 4 neuro-blocks. In addition, all the neuro-blocks are for example connected by an interconnection line 4 having the structure of a chained bus. More precisely, the arithmetic and logic units (ALU) of each neuro-block can be connected to this bus. The "inter-ALU" interconnection line 4 thus crosses all the neuro-blocks 1 of the same circuit 10. Each branch is connected to a routing block 3, the exchanges between the different branches being done via this block 3. This routing block 3 also receives input data and transmits circuit output data, for example via an input/output data transformation module 6. A direct memory access module 8 (DMA) enables an extension of the available memory. It is coupled via buses 14, 15 to an internal memory 9 containing a program, and may be connected to each neuro-block 1, more particularly to the memory management unit of each neuro-block. A control module 11 acts as a centralized control processor. The exemplary neural system of FIG. 1 is used to illustrate a context of use of neuro-blocks. A processing module, or neuro-block, according to the invention can of course be applied to other architectures of neural systems. FIG. 2 illustrates a problem that arises for a signal processing module, particularly of the neural network type. More particularly, Figure 2 shows a memory 20 used in a neuro-block. It is a general-purpose memory that stores several types of data. It stores, in particular, the weights of the synapses, signal processing filter coefficients, notably for convolution products or Fourier transforms, final or intermediate calculation results, as well as other possible data. A neuro-block performs a very large number of calculations with potentially variable precisions. For example, in the case of online learning, it is possible to distinguish between the learning phase, in which the synaptic weights are calculated, requiring high accuracy, for example on 16 bits or 32 bits, and the operational phase, which requires less precision, for example on 8 bits. In any case, the variability of the precisions used leads to operations on, in particular, 4 bits, 8 bits, 16 bits or 32 bits or more and even, at the other extreme, on a single bit. Figure 2 shows a simple example where a 4-bit word 22 is stored in the memory 20 at a given address Ak. In this memory having a width of 32 bits, it can be seen that the space is poorly occupied. This case illustrates an inefficient use of silicon.
By way of example, in a case where the addresses of the memory 20 are coded on 12 bits, it has 4096 addresses. For a 32-bit width, it can then contain 16 kilobytes, or more precisely 16384 bytes. In conventional solutions of the prior art, this available space of 16 kilobytes is for example used at 50%, or less if the data are not exact multiples of the memory width. FIG. 3 illustrates the mode of operation of a neuro-block according to the invention; more particularly, FIG. 3 illustrates the mode of storage of the data in the memory of the neuro-block and the mode of transfer to the arithmetic and logic units 31 (ALU). Instead of storing each data word at a single address as in the case of Figure 2, the data are stored over several addresses. The example of FIG. 3 illustrates a preferred case where the data are stored one bit per address, the whole data word being stored over several successive addresses. For example, an 8-bit word W1 is stored on the first bit 301 of the memory between the addresses A0 and A7. Another word W2 of 4 bits is for example stored between the addresses A8 and A11. Considering the memory as a set of lines 38, each corresponding to an address, and columns 39, each having a width of one bit, the memory is filled column by column. In other words, the memory is organized such that each data item is stored per column on several successive addresses, from the least significant bit to the most significant bit for example. The filling is thus transposed with respect to a conventional solution where the memory is filled line by line. In the example of Figure 3, the ranks of the bits increase with the addresses. The opposite is possible, the highest address of the word then containing the least significant bit. Moreover, the words are transferred in series to an operational calculation unit 34, for example consisting of a set of ALUs 31. A data word is thus identified by the bit rank it occupies in width and the addresses it occupies in its column. The word W2 thus occupies the first bit between the addresses A8 and A11.
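The following is a minimal Python sketch of this transposed, bit-serial storage, assuming a memory modeled as a list of 32-bit rows; the class and method names are purely illustrative:

```python
WIDTH = 32  # memory width in bits: one column (39) per bit rank

class TransposedMemory:
    """Memory filled column by column: one bit of a word per address."""
    def __init__(self, depth):
        self.rows = [0] * depth  # one integer per address, WIDTH bits wide

    def store_word(self, column, start_addr, value, nbits):
        # LSB at the lowest address, one bit per successive address.
        for k in range(nbits):
            bit = (value >> k) & 1
            self.rows[start_addr + k] |= bit << column

    def read_word_serial(self, column, start_addr, nbits):
        # Yields the bits serially, as the ALUs would receive them (LSB first).
        for k in range(nbits):
            yield (self.rows[start_addr + k] >> column) & 1

mem = TransposedMemory(depth=4096)                                 # 12-bit addresses
mem.store_word(column=0, start_addr=0, value=0b10110101, nbits=8)  # W1, A0..A7
mem.store_word(column=0, start_addr=8, value=0b1010, nbits=4)      # W2, A8..A11
print(list(mem.read_word_serial(column=0, start_addr=0, nbits=8)))
# [1, 0, 1, 0, 1, 1, 0, 1]  (LSB first)
```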
[0012] The transposed filling of the memory as described above, combined with the serial data transfer, makes it possible to optimize the available memory space. The storage structure of a module according to the invention, as illustrated in FIG. 3, brings another advantage. In particular, it makes it possible to speed up certain calculations. Because of the "serial" type of storage inside the memory, a word can be read one way or the other. More precisely, the data can be transferred to the calculation units starting from the least significant bit, LSB (Least Significant Bit), or starting from the most significant bit, MSB (Most Significant Bit). Depending on the operation, a transfer in one direction rather than the other can be chosen. Thus, for a calculation of the maximum of two binary numbers, it is advantageous from the point of view of calculation speed to perform a transfer from the MSB, the comparison starting with the most significant bit. Indeed, if the first datum has its MSB at 0 and the second its MSB at 1, the calculation unit can immediately conclude that the second datum is the greater of the two if the codings are unsigned. On the other hand, for an addition, it is more advantageous to start the transfer with the LSB, for the propagation of the carry. FIG. 3 illustrates an embodiment where the bits are transferred directly to the ALUs 31, the transfers being effected with one column per ALU for example. In the case of an application with 32 ALUs and a 32-bit memory width, a bit rank is assigned to each ALU. In the example of Figure 3, the words are stored on several successive addresses in ascending order. They can of course be stored in descending order, or out of order if necessary. In addition, the storage addresses are not necessarily successive; there may indeed be address jumps. In fact, order and succession depend in particular on the application.
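The following minimal sketch, under the same illustrative conventions as above (bits held in Python lists), shows why the transfer direction matters: the LSB-first addition propagates the carry naturally, while the MSB-first comparison of unsigned words can often conclude early:

```python
def serial_add(a_bits, b_bits):
    """Bit-serial addition, LSB first: the carry propagates naturally."""
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (a & carry) | (b & carry)
    return out + [carry]  # result, LSB first

def serial_max_msb_first(a_bits, b_bits):
    """Maximum of two unsigned words whose bits arrive MSB first:
    the comparison can conclude as soon as the bits differ."""
    for a, b in zip(a_bits, b_bits):
        if a != b:
            return a_bits if a > b else b_bits
    return a_bits  # equal words

print(serial_add([1, 1, 0], [1, 0, 0]))            # 3 + 1 -> [0, 0, 1, 0] = 4
print(serial_max_msb_first([1, 0, 1], [1, 1, 0]))  # max(5, 6) -> [1, 1, 0]
```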
[0013] FIG. 4 shows a more advanced embodiment where the serial words are transferred via a routing unit 41. This routing unit makes it possible to further improve the use of the memory space. This unit makes it possible in particular to route or broadcast a data word to one or more circuits, in particular to one or more ALUs 31. For example, a synaptic weight stored in the memory 30 can be transferred to several ALUs 31, each forming a calculation unit. Filters, convolution products and other types of operations specific to neural networks in particular have data in common. In other words, this data is shared between several operations; for example, the same filter is used on several pixels of the same image in a convolution operation. In the absence of the routing unit 41, the aforementioned synaptic weight would have to be stored in several places in the memory (for example on each column) in order to be transferred to the ALUs that need it. The broadcast performed by the routing unit thus avoids multiple copies or assignments in the memory 30, a shared datum being able to be stored in a single location in the memory. In addition to the broadcast function described above, the routing unit 41 can perform other functions. This unit 41 can for example also perform the following operations: shifting of the bits of the data words (in one direction or the other), for example to facilitate so-called sliding-window calculations; logical operations; extension of the words according to different scales, by inserting for example one or more "0" between all the bits of a word (a sketch of the broadcast and extension operations is given after this paragraph). The routing unit 41 is for example composed of multiplexers, registers and logic gates in order to perform the various routing and data transformation operations. These elements are arranged, in known manner, so that the routing and transformation operations between the memory 30 and the operational unit 34 can be performed in a single cycle. FIG. 5 shows an example of a possible embodiment of a signal processing module according to the invention. The module comprises a memory 30, a routing unit 41 and an operational unit 34 arranged and operating in accordance with the description of FIG. 4. The memory 30 is for example a memory of the RAM type having a capacity of 16 kilobytes for a width of 32 bits. As indicated above, it is especially intended to store various data such as input data, results, intermediate results, filter coefficients or convolution products for pre-processing, as well as synaptic coefficients. As in the previous examples, the operational unit 34 has 32 ALUs. It performs all the calculations required within the neuro-block, in particular both pre-processing and neural processing. The operations that can be implemented are for example the following: addition and subtraction; multiplication and division; calculation of minimum and maximum; numerical calculation by coordinate rotation (CORDIC) for trigonometric or hyperbolic functions. The ALUs can operate on operands from different sources. A first data source is of course the memory 30, via the routing unit 41. The operands can also come from other modules 1 when the module is implemented in a system, for example a neural system of the type of FIG. 1. The transfers between the modules can be done for example by the routing unit 41. The latter provides, for example, in the module of rank n, a local interconnection 5 with the neighboring modules, for example with the modules of ranks n-2, n-1, n+1 and n+2.
The module 1 is also able to be connected to an external bus 2, for example of the type of the broadcast bus of the neural system of FIG. 1, via input and output interfaces 52, 53 which will be described later.
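As announced above, the following is a minimal Python sketch of two of the routing-unit operations: broadcasting one serial word to several calculation units, and word extension by inserting zeros between the bits. The function names are purely illustrative:

```python
def broadcast(word_bits, n_alus):
    """Route the same serial word to several calculation units, so that a
    shared datum (e.g. a synaptic weight) is stored only once in memory."""
    return [list(word_bits) for _ in range(n_alus)]

def extend_word(word_bits, zeros_between=1):
    """Extend a word to another scale by inserting '0' between its bits."""
    out = []
    for i, bit in enumerate(word_bits):
        out.append(bit)
        if i < len(word_bits) - 1:
            out.extend([0] * zeros_between)
    return out

print(broadcast([1, 0, 1], n_alus=3))  # three copies of the same serial word
print(extend_word([1, 1, 0, 1]))       # [1, 0, 1, 0, 0, 0, 1]
```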
[0014] The neuro-block comprises for example a unit 51 for memory virtualization and management of the topology of neural networks (VCU), allowing the virtualization of the memory of the neuro-block and the implementation of different neural network topologies. This unit 51 has direct and independent access to the memory 30, in read and write.
[0015] The VCU unit 51 can also provide the global connectivity between the ALUs of the operational unit 34. For this purpose, it has a certain number of operators making it possible to rearrange the data stored in the memory 30, by duplication or reordering for example (reading a datum at one address and writing it to another). It also makes it possible to rearrange the data in memory, for example to replace data that is no longer useful with useful data, allowing, for example, the routing unit and the set of ALUs 34 to perform the same sequence of operations with the same operand addresses in the memory 30, but with new useful data. The data thus reorganized is ready to be used by the routing unit 41 and the operational unit 34. The VCU unit 51 is also connected to a direct memory access module (DMA) outside the neuro-block, via a 32-bit bus for example. It can thus read entire blocks of the memory 30 to send them to an external memory, or write entire blocks into the memory 30 from an external memory. The VCU unit 51 thus makes it possible to virtualize the memory containing the synaptic weights; it in fact allows a virtualization of the synaptic weights outside the neuro-block. The neuro-block 1 comprises for example an input module 52 and an output module 53 for connecting the neuro-block to the broadcast bus. In particular, they manage the asynchronism (or at least the absence of synchronization) between the different modules connected via the broadcast bus 2. In the application case of FIG. 1, the various modules are in particular the other neuro-blocks, and, in a silicon implementation, not forcing the synchronization of the neuro-blocks (that is to say, not having a fully synchronous circuit) makes it possible to gain in operating frequency and simplifies a modular realization regardless of the number of neuro-blocks. The input module 52 has a unique address specific to the neuro-block, but possibly re-assignable. It notably monitors the control words of the messages passing over the broadcast bus: if the neuro-block identifier located in the message header (Neuro-block ID) corresponds to the own address of the input module 52, or if this identifier corresponds to a broadcast operation, the module captures all the data of the message and stores them in the memory 30 at addresses previously given by the internal program of the neuro-block, according to the addressing mode described with reference to Figure 3. The output module 53 is equipped for example with a FIFO memory that manages waiting due to the TDMA protocol, if the latter is for example used to access the broadcast bus. Depending on the type of data, the module 53 can generate a control flit. It stores locally n sequences of 32 bits, for example, before sending them on the broadcast bus, according to the protocol used. Advantageously, the TDMA protocol can be combined with the use of the memory 30 by the different resources 51, 41, 52. Indeed, the TDMA protocol makes it possible to divide the time into slots, each module internal to the neuro-block having a dedicated slot (for example, a first time slot being reserved for the VCU 51, a second for the routing system 41 connected to the ALUs 34, a third for the input module 52, etc.). A block 54, for example a SIMD (Single Instruction Multiple Data) controller with 32 channels, carries out the control of transfers within the neuro-block according to a conventional process known to those skilled in the art.
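As a minimal sketch of how such fixed-slot TDMA arbitration of the shared memory 30 might look, assuming a simple round-robin slot table (the slot order follows the example above; all names are illustrative):

```python
# Hypothetical TDMA arbitration of the shared memory: time is divided into
# slots, each resource internal to the neuro-block owning one dedicated slot.
SLOT_OWNERS = ["VCU_51", "ROUTING_41", "INPUT_52"]

def slot_owner(cycle):
    """Resource allowed to access the memory during the given cycle."""
    return SLOT_OWNERS[cycle % len(SLOT_OWNERS)]

def access_granted(requester, cycle):
    """A resource's memory access proceeds only during its own slot."""
    return requester == slot_owner(cycle)

for c in range(6):
    print(c, slot_owner(c), access_granted("ROUTING_41", c))
```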
In addition, each ALU 31 composing the block 34 is for example controlled by a guard bit, the state of which depends on the data to be processed. This guard bit can also be controlled by the block 54. The guard bit allows a conditional execution of the operations by the ALUs 31, an ALU executing or not an operation sent by the block 54 as a function of the value of the guard bit (this guard bit makes it possible, for example, to ignore the result of the operation if necessary; a sketch of this mechanism is given at the end of this description). Figure 6 illustrates the different synchronization domains, or clock domains, within a neuro-block. These different synchronization domains 61, 62, 63 characterize the decoupling of computations, "long distance" communications and virtualization. In other words, a first synchronization frequency 61 clocks the calculations performed by the operational unit 34, a second synchronization frequency 62 clocks the communications towards the broadcast bus via the input/output modules 52, 53, and a third synchronization frequency 63 clocks the operation of the memory virtualization unit 51. These three synchronization domains are independent, and the synchronization frequencies can vary over time, for example to adapt to the processing speed appropriate to each domain at each moment. A module according to the invention allows in particular an efficient implementation of the processing and of the networks on silicon. The serial architecture inside the module and the storage organization in the memory 30 allow a precision that is variable down to the bit, while optimizing the memory space occupied as well as the calculation time. The invention thus makes it possible to use all the storage resources. The decoupling provided by the different synchronization domains notably brings the following advantages: an increase in the operating frequency of the circuit during implementation, the possibility of varying the operating frequencies of the neuro-blocks independently to optimize energy consumption, etc. It also makes it possible to decouple the programming of the different parts in the different synchronization domains, thus facilitating the development of applications and the scalability of the proposed architecture across the different possible realizations (variation of the number of neuro-blocks, realization of the communication protocols of units 52 and 53, etc.).
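As announced above, here is a minimal Python sketch of the guard-bit mechanism, assuming a SIMD controller broadcasting one operation per step to all ALUs; every name is illustrative:

```python
def simd_step(op, operands, guard_bits, accumulators):
    """Broadcast one operation to all ALUs; an ALU whose guard bit is 0
    ignores the operation and keeps its accumulator unchanged."""
    return [op(acc, x) if guard else acc
            for acc, x, guard in zip(accumulators, operands, guard_bits)]

def add(acc, x):
    return acc + x

accs = [0, 0, 0, 0]
accs = simd_step(add, operands=[5, 7, 2, 9],
                 guard_bits=[1, 0, 1, 1], accumulators=accs)
print(accs)  # [5, 0, 2, 9]: the second ALU ignored the broadcast addition
```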
Claims:
Claims (20)
[0001]
CLAIMS 1. Signal processing module, comprising at least one operational unit (34) incorporating calculation units (31), input and output interfaces (52, 53) adapted to be connected to a bus, and a memory (30) storing data for said calculation units, characterized in that said memory (30) is organized such that each data word is stored by column (39) on a plurality of addresses, a column having a width of one bit, the words being transferred in series to said calculation units (31).
[0002]
Signal processing module according to claim 1, characterized in that each data word is stored by column (39) on several addresses in an order dependent on the application using said data.
[0003]
3. Signal processing module according to claim 2, characterized in that said several addresses comprise address jumps.
[0004]
4. Signal processing module according to any one of the preceding claims, characterized in that the data transfers are carried out with one column (39) per calculation unit (31).
[0005]
Signal processing module according to any one of the preceding claims, characterized in that it comprises a routing unit (41) connected between said memory (30) and the operational unit (34), said routing unit having a number of inputs at least equal to the number of bits of width of said memory (30), each input being connected to one and only one column, said routing unit routing the data words of said memory to the calculation units (31), the same word being able to be routed to several calculation units (31).
[0006]
6. Signal processing module according to claim 5, characterized in that the routing unit comprises at least two other series of inputs/outputs adapted to be connected to circuits external to said module (1).
[0007]
7. Signal processing module according to claim 6, characterized in that said inputs/outputs are adapted to be connected to the inputs and outputs of the routing unit of another module identical to said module (1).
[0008]
8. Signal processing module according to any one of claims 5 to 7, characterized in that the routing unit (41) performs all or part of the following operations: - bit shifting of the data words; - logical operations; - extension of words.
[0009]
9. Signal processing module according to any one of the preceding claims, characterized in that it comprises a memory virtualization unit (51) connected on the one hand in write and read to said memory (30) and on the other hand to an external memory via a DMA-type circuit.
[0010]
10. The signal processing module according to claim 9, characterized in that the memory virtualization unit (51) performs reorganization operations of said memory (30).
[0011]
11. Signal processing module according to claim 10, characterized in that the reorganization operations of said memory (30) are done by duplication or reordering of the data between the columns (39) of said memory (30).
[0012]
12. Signal processing module according to any one of the preceding claims, characterized in that the input and output interfaces (52, 53) communicate with said bus (2) by a TDMA protocol.
[0013]
Signal processing module according to claims 1, 8 and 9, characterized in that said memory allows independent read and write accesses for the input interfaces (52), the virtualization unit (51) and the routing unit (41).
[0014]
Signal processing module according to one of the preceding claims, characterized in that it operates according to a plurality of independent synchronous zones (61, 62, 63), the operational unit (34) operating in a first clock domain (61), the input (52) and output (53) interfaces operating in a second clock domain (62).
[0015]
The signal processing module according to claim 14 and any one of claims 9 to 11, characterized in that the memory virtualization unit (51) operates in a third independent clock domain (63).
[0016]
Signal processing module according to one of the preceding claims, characterized in that the calculation units (31) perform the operations according to the value of a guard bit allocated to each of said units (31).
[0017]
Signal processing module according to one of the preceding claims, characterized in that the operational unit (34) comprises at least 32 calculation units (31), said memory (30) having a width of 32 bits.
[0018]
18. Signal processing module according to any one of the preceding claims, characterized in that, being able to implement a set of neurons, it performs neural calculations or digital signal processing, said memory storing at least calculation results, filter coefficients or convolution products, and synaptic coefficients.
[0019]
19. Circuit capable of implementing a neural network, characterized in that it comprises at least one series of signal processing modules (1) according to claim 18.
[0020]
Circuit according to Claim 19, characterized in that the signal processing modules (1) are grouped into branches, a branch consisting of a group of modules (1) and a broadcast bus (2), said modules (1) being connected to said bus (2), a routing block (3) connected to the broadcast buses of said branches performing at least the routing and the broadcasting of the input and output data of said circuit to and from said branches.
Similar technologies:
Publication number | Publication date | Patent title
EP3084588B1|2017-10-25|Signal processing module, especially for a neural network and a neuronal circuit
EP0154341B1|1988-12-28|Discrete cosine transform processor
EP0154340A1|1985-09-11|Inverse discrete cosine transform processor
EP0020202A1|1980-12-10|Multiprocessing system for signal treatment
BE897441A|1984-02-02|ASSOCIATIVE CALCULATOR FOR FAST MULTIPLICATION
FR2525787A1|1983-10-28|MULTIMICROPROCESSOR SYSTEM
WO2015049183A1|2015-04-09|Electronic circuit, in particular suitable for the implementation of a neural network, and neural system
EP0184494A1|1986-06-11|System for the simultaneous transmission of data blocks or vectors between a memory and one or a plurality of data processing units
EP0558125B1|1997-10-29|Neural processor with distributed synaptic cells
EP2332067A1|2011-06-15|Device for the parallel processing of a data stream
EP0437876B1|1996-03-13|Programmable serial multiplier
EP0262032B1|1991-11-27|Binary adder having a fixed operand, and a parallel/serial multiplier comprising such an adder
EP0259231B1|1992-03-04|Device for the determination of the digital transform of a signal
FR2667176A1|1992-03-27|METHOD AND CIRCUIT FOR ENCODING A DIGITAL SIGNAL FOR DETERMINING THE SCALAR PRODUCT OF TWO VECTORS AND CORRESPONDING TCD PROCESSING.
EP0554177A1|1993-08-04|Associative memory architecture
WO2017108398A1|2017-06-29|Electronic circuit, particularly for the implementation of neural networks with multiple levels of precision
WO2000026790A1|2000-05-11|Memory with vectorial access
FR2683349A1|1993-05-07|BINARY RESISTANCE NETWORK AND USE THEREOF FOR LABELING COMPONENT COMPONENTS OF DIGITAL IMAGES IN ARTIFICIAL VISION.
FR2606186A1|1988-05-06|CALCULATION PROCESSOR COMPRISING A PLURALITY OF SERIES-CONNECTED STAGES, CALCULATOR AND CALCULATION METHOD USING THE SAME
EP0254628B1|1990-11-22|Digital signal-processing circuit performing a cosine transform
FR2487545A1|1982-01-29|APPARATUS FOR CONVERTING DECIMAL NUMBERS BINARY CODES IN BINARY NUMBERS
EP0435399A1|1991-07-03|Arithmetic processing unit to be associated with a microprocessor central unit
FR2598532A1|1987-11-13|PROCESSOR FOR CALCULATING DISCRETE FOURIER TRANSFORMATION COMPRISING AN ONLINE TEST DEVICE
FR2741448A1|1997-05-23|Fourier transform spectral analyser with fast response time
WO2003093973A1|2003-11-13|Montgomery multiplication
Patent family:
Publication number | Publication date
WO2015090885A1|2015-06-25|
EP3084588B1|2017-10-25|
US20160292566A1|2016-10-06|
US11017290B2|2021-05-25|
FR3015068B1|2016-01-01|
EP3084588A1|2016-10-26|
Cited documents:
Publication number | Filing date | Publication date | Applicant | Patent title
US5023833A|1987-12-08|1991-06-11|California Institute Of Technology|Feed forward neural network for unary associative memory|
US5063521A|1989-11-03|1991-11-05|Motorola, Inc.|Neuram: neural network with ram|
WO1991019259A1|1990-05-30|1991-12-12|Adaptive Solutions, Inc.|Distributive, digital maximization function architecture and method|
EP0558125A1|1992-02-26|1993-09-01|Laboratoires D'electronique Philips S.A.S.|Neural processor with distributed synaptic cells|
EP0694856A1|1994-07-28|1996-01-31|International Business Machines Corporation|Daisy chain circuit for serial connection of neuron circuits|
US5129092A|1987-06-01|1992-07-07|Applied Intelligent Systems,Inc.|Linear chain of parallel processors and method of using same|
US5014235A|1987-12-15|1991-05-07|Steven G. Morton|Convolution memory|
US5014236A|1988-01-29|1991-05-07|International Business Machines Corporation|Input/output bus expansion interface|
JP2647330B2|1992-05-12|1997-08-27|インターナショナル・ビジネス・マシーンズ・コーポレイション|Massively parallel computing system|
GB9223226D0|1992-11-05|1992-12-16|Algotronix Ltd|Improved configurable cellular array |
US5717947A|1993-03-31|1998-02-10|Motorola, Inc.|Data processing system and method thereof|
US5557734A|1994-06-17|1996-09-17|Applied Intelligent Systems, Inc.|Cache burst architecture for parallel processing, such as for image processing|
JP3722619B2|1997-07-10|2005-11-30|沖電気工業株式会社|Memory device and access control method thereof|
GB9902115D0|1999-02-01|1999-03-24|Axeon Limited|Neural networks|
US7444531B2|2001-03-05|2008-10-28|Pact Xpp Technologies Ag|Methods and devices for treating and processing data|
JP2003168287A|2001-07-24|2003-06-13|Toshiba Corp|Memory module, memory system, and data transfer method|
KR100543449B1|2003-04-11|2006-01-23|삼성전자주식회사|Semiconductor memory device capable of accessing all memory cells by relative address manner|
KR100812225B1|2005-12-07|2008-03-13|한국전자통신연구원|Crossbar switch architecture for multi-processor SoC platform|
US7752417B2|2006-06-05|2010-07-06|Oracle America, Inc.|Dynamic selection of memory virtualization techniques|
US9384168B2|2013-06-11|2016-07-05|Analog Devices Global|Vector matrix product accelerator for microprocessor integration|
Citing documents:
FR3045893B1|2015-12-21|2017-12-29|Commissariat Energie Atomique|ELECTRONIC CIRCUIT, PARTICULARLY ABLE TO IMPLEMENTATION OF NEURON NETWORKS AT SEVERAL LEVELS OF PRECISION.|
CN108701236B|2016-01-29|2022-01-21|快图有限公司|Convolutional neural network|
US10497089B2|2016-01-29|2019-12-03|Fotonation Limited|Convolutional neural network|
FR3050846B1|2016-04-27|2019-05-03|Commissariat A L'energie Atomique Et Aux Energies Alternatives|DEVICE AND METHOD FOR DISTRIBUTING CONVOLUTION DATA OF A CONVOLUTIONAL NEURON NETWORK|
US10395165B2|2016-12-01|2019-08-27|Via Alliance Semiconductor Co., Ltd|Neural network unit with neural memory and array of neural processing units that collectively perform multi-word distance rotates of row of data received from neural memory|
CN107423816B|2017-03-24|2021-10-12|中国科学院计算技术研究所|Multi-calculation-precision neural network processing method and system|
US10558430B2|2018-04-17|2020-02-11|Fotonation Limited|Neural network engine|
US20190386895A1|2018-06-13|2019-12-19|At&T Intellectual Property I, L.P.|East-west traffic monitoring solutions for the microservice virtualized data center lan|
Legal status:
2015-12-31| PLFP| Fee payment|Year of fee payment: 3 |
2016-12-29| PLFP| Fee payment|Year of fee payment: 4 |
2018-01-02| PLFP| Fee payment|Year of fee payment: 5 |
2019-09-27| ST| Notification of lapse|Effective date: 20190906 |
Priority:
Application number | Filing date | Patent title
FR1362859A|2013-12-18|2013-12-18|SIGNAL PROCESSING MODULE, IN PARTICULAR FOR NEURONAL NETWORK AND NEURONAL CIRCUIT|
FR1362859A| FR3015068B1|2013-12-18|2013-12-18|SIGNAL PROCESSING MODULE, IN PARTICULAR FOR NEURONAL NETWORK AND NEURONAL CIRCUIT|
EP14803126.3A| EP3084588B1|2013-12-18|2014-11-27|Signal processing module, especially for a neural network and a neuronal circuit|
US15/037,659| US11017290B2|2013-12-18|2014-11-27|Signal processing module, especially for a neural network and a neuronal circuit|
PCT/EP2014/075729| WO2015090885A1|2013-12-18|2014-11-27|Signal processing module, especially for a neural network and a neuronal circuit|